{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "64d2e4a0",
   "metadata": {},
   "source": [
    "# End of Week 1 Exercise\n",
    "\n",
    "To demonstrate your familiarity with the OpenAI API and with Ollama, build a tool that takes a technical question\n",
    "and responds with an explanation. You will be able to use this tool yourself throughout the course!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "e62b915e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "import ollama\n",
    "from dotenv import load_dotenv\n",
    "import os\n",
    "from IPython.display import display, update_display, Markdown"
   ]
  },
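  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "8bdfc47a",
   "metadata": {},
   "outputs": [],
   "source": [
    "MODEL_GPT = 'gpt-4o-mini'\n",
    "MODEL_LLAMA = 'llama3'\n",
    "\n",
    "# Load OPENAI_API_KEY from a local .env file, if one exists\n",
    "load_dotenv()\n",
    "api_key = os.getenv('OPENAI_API_KEY')\n",
    "if not api_key:\n",
    "    print('OPENAI_API_KEY is not set; add it to your .env file before using OpenAI.')\n",
    "\n",
    "# Pass the key explicitly so the client uses the value checked above\n",
    "openai = OpenAI(api_key=api_key)"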
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "57983d03",
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_messages(prompt=\"Describe some of the business applications of Generative AI\"):\n",
    "    \"\"\"Create properly formatted messages for API calls\"\"\"\n",
    "    messages = [\n",
    "        {\n",
    "            \"role\": \"system\",\n",
    "            \"content\": \"You are a helpful technical assistant that provides clear, detailed explanations for technical questions.\"\n",
    "        },\n",
    "        {\"role\": \"user\", \"content\": prompt}\n",
    "    ]\n",
    "    return messages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "a6bcb94d",
   "metadata": {},
   "outputs": [],
   "source": [
    "def answer_with_openai(prompt=\"Describe some of the business applications of Generative AI\"):\n",
    "    \"\"\"Stream an answer from the OpenAI API into a live Markdown display and return the full text\"\"\"\n",
    "    try:\n",
    "        messages = create_messages(prompt)\n",
    "        stream = openai.chat.completions.create(\n",
    "            model=MODEL_GPT,\n",
    "            messages=messages,\n",
    "            temperature=0.7,\n",
    "            stream=True\n",
    "        )\n",
    "        answer = \"\"\n",
    "        display_handle = display(Markdown(\"\"), display_id=True)\n",
    "        for chunk in stream:\n",
    "            # Skip chunks that carry no text (for example, the final chunk)\n",
    "            if chunk.choices and chunk.choices[0].delta.content:\n",
    "                answer += chunk.choices[0].delta.content\n",
    "                # Re-render the accumulated answer as-is so code fences in the reply stay intact\n",
    "                update_display(Markdown(answer), display_id=display_handle.display_id)\n",
    "        return answer\n",
    "    except Exception as e:\n",
    "        # Print the error: technical_qa_tool ignores the return value, so it would otherwise be lost\n",
    "        error = f\"Error with OpenAI: {e}\"\n",
    "        print(error)\n",
    "        return error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "e96159ab",
   "metadata": {},
   "outputs": [],
   "source": [
    "def answer_with_ollama(prompt=\"Describe some of the business applications of Generative AI\"):\n",
    "    \"\"\"Stream an answer from the Ollama API into a live Markdown display and return the full text\"\"\"\n",
    "    try:\n",
    "        messages = create_messages(prompt)\n",
    "        stream = ollama.chat(\n",
    "            model=MODEL_LLAMA,\n",
    "            messages=messages,\n",
    "            stream=True\n",
    "        )\n",
    "        answer = \"\"\n",
    "        display_handle = display(Markdown(\"\"), display_id=True)\n",
    "        for chunk in stream:\n",
    "            # Skip chunks that carry no text\n",
    "            if chunk['message']['content']:\n",
    "                answer += chunk['message']['content']\n",
    "                # Re-render the accumulated answer as-is so code fences in the reply stay intact\n",
    "                update_display(Markdown(answer), display_id=display_handle.display_id)\n",
    "        return answer\n",
    "    except Exception as e:\n",
    "        # Print the error: technical_qa_tool ignores the return value, so it would otherwise be lost\n",
    "        error = f\"Error with Ollama: {e}\"\n",
    "        print(error)\n",
    "        return error"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "ab72f8b6",
   "metadata": {},
   "outputs": [],
   "source": [
    "def technical_qa_tool(question, use_openai=True, use_ollama=True):\n",
    "    \"\"\"Stream technical explanations for a question from OpenAI, Ollama, or both\"\"\"\n",
    "    print(f\"Question: {question}\")\n",
    "    print(\"=\" * 80)\n",
    "    \n",
    "    if use_openai:\n",
    "        print(\"\\n🤖 OpenAI Response:\")\n",
    "        print(\"-\" * 40)\n",
    "        answer_with_openai(question)\n",
    "    \n",
    "    if use_ollama:\n",
    "        print(\"\\n🦙 Ollama Response:\")\n",
    "        print(\"-\" * 40)\n",
    "        answer_with_ollama(question)\n",
    "    \n",
    "    print(\"\\n\" + \"=\" * 80)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "1a6aa4a2",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: What is the difference between supervised and unsupervised machine learning?\n",
      "================================================================================\n",
      "\n",
      "🤖 OpenAI Response:\n",
      "----------------------------------------\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "Supervised and unsupervised machine learning are two primary categories of machine learning techniques, and they differ mainly in how they learn from data and the type of problems they are used to solve. Here’s a detailed explanation of each:\n",
       "\n",
       "### Supervised Machine Learning\n",
       "\n",
       "**Definition**: In supervised learning, the model is trained on a labeled dataset, meaning that each training example is paired with an output label. The goal is to learn a mapping from inputs (features) to the output labels.\n",
       "\n",
       "**Characteristics**:\n",
       "- **Labeled Data**: Requires a dataset that includes both the input features and the corresponding output labels.\n",
       "- **Objective**: The objective is to predict the output for new, unseen data based on the learned mapping from the training data.\n",
       "- **Common Techniques**:\n",
       "  - **Regression**: For predicting continuous values (e.g., predicting house prices).\n",
       "  - **Classification**: For predicting discrete labels (e.g., spam detection in emails).\n",
       "- **Examples**:\n",
       "  - Predicting whether an email is spam or not based on various features (classification).\n",
       "  - Forecasting sales figures based on historical sales data (regression).\n",
       "\n",
       "### Unsupervised Machine Learning\n",
       "\n",
       "**Definition**: In unsupervised learning, the model is trained on data that is not labeled, meaning that it does not have predefined output labels. The goal is to discover patterns, groupings, or structures within the data.\n",
       "\n",
       "**Characteristics**:\n",
       "- **Unlabeled Data**: Works with datasets that only have input features without any associated output labels.\n",
       "- **Objective**: The objective is to explore the data and find hidden patterns or intrinsic structures without specific guidance.\n",
       "- **Common Techniques**:\n",
       "  - **Clustering**: Grouping similar data points together (e.g., customer segmentation).\n",
       "  - **Dimensionality Reduction**: Reducing the number of features while retaining essential information (e.g., PCA - Principal Component Analysis).\n",
       "- **Examples**:\n",
       "  - Grouping customers into segments based on purchasing behavior (clustering).\n",
       "  - Reducing the dimensionality of a dataset to visualize it in two or three dimensions (dimensionality reduction).\n",
       "\n",
       "### Key Differences\n",
       "\n",
       "1. **Data Type**:\n",
       "   - Supervised Learning: Requires labeled data.\n",
       "   - Unsupervised Learning: Works with unlabeled data.\n",
       "\n",
       "2. **Goal**:\n",
       "   - Supervised Learning: To learn a function that maps inputs to the correct outputs.\n",
       "   - Unsupervised Learning: To identify patterns or groupings in the input data.\n",
       "\n",
       "3. **Applications**:\n",
       "   - Supervised Learning: Typically used in scenarios where past data with known outcomes is available (e.g., fraud detection, image classification).\n",
       "   - Unsupervised Learning: Used for exploratory data analysis or when the outcome is not known (e.g., market basket analysis, anomaly detection).\n",
       "\n",
       "In summary, the primary difference between supervised and unsupervised machine learning lies in the presence or absence of labeled data and the objectives of the learning process. Supervised learning aims to predict outcomes based on existing labels, while unsupervised learning seeks to identify hidden structures in data without predefined labels."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "🦙 Ollama Response:\n",
      "----------------------------------------\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "In machine learning, there are two main categories: supervised and unsupervised learning. The key difference lies in the type of data used to train the model and the goal of the learning process.\n",
       "\n",
       "**Supervised Learning**\n",
       "\n",
       "In supervised learning, you have a labeled dataset that contains both input data (features) and corresponding output labels or target variables. The goal is to learn a mapping between the input data and the output labels so that the model can make accurate predictions on new, unseen data.\n",
       "\n",
       "Here are some characteristics of supervised learning:\n",
       "\n",
       "1. Labeled training data: You have a dataset with input data and corresponding output labels.\n",
       "2. Specific goal: You want to predict the output label for a given input instance.\n",
       "3. Model evaluation: You evaluate the performance of your model using metrics like accuracy, precision, recall, F1 score, etc.\n",
       "\n",
       "Examples of supervised learning tasks include:\n",
       "\n",
       "* Image classification (e.g., recognizing dogs vs. cats)\n",
       "* Sentiment analysis (e.g., determining if text is positive or negative)\n",
       "* Regression problems (e.g., predicting house prices based on features like number of bedrooms and square footage)\n",
       "\n",
       "**Unsupervised Learning**\n",
       "\n",
       "In unsupervised learning, you have an unlabeled dataset, and the goal is to discover patterns, relationships, or structure in the data without a specific target variable. This type of learning is often used for exploratory data analysis, feature selection, and dimensionality reduction.\n",
       "\n",
       "Here are some characteristics of unsupervised learning:\n",
       "\n",
       "1. Unlabeled training data: You have a dataset with only input features (no output labels).\n",
       "2. No specific goal: You want to find interesting patterns or structure in the data.\n",
       "3. Model evaluation: You evaluate the performance of your model using metrics like silhouette score, Calinski-Harabasz index, etc.\n",
       "\n",
       "Examples of unsupervised learning tasks include:\n",
       "\n",
       "* Clustering (e.g., grouping customers based on their purchase history)\n",
       "* Dimensionality reduction (e.g., reducing the number of features in a dataset while preserving important information)\n",
       "* Anomaly detection (e.g., identifying unusual behavior or outliers in financial transactions)\n",
       "\n",
       "In summary, supervised learning involves training a model to make predictions based on labeled data, whereas unsupervised learning aims to discover patterns and relationships in unlabeled data."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "================================================================================\n"
     ]
    }
   ],
   "source": [
    "# Test the tool with a technical question\n",
    "technical_question = \"What is the difference between supervised and unsupervised machine learning?\"\n",
    "technical_qa_tool(technical_question)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b0a539e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Interactive version - uncomment to use\n",
    "# user_question = input(\"Enter your technical question: \")\n",
    "# technical_qa_tool(user_question)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
